EXTREME MITOCHONDRIAL SEQUENCE DIVERSITY IN THE INTERMEDIATE
SCHISTOSOMIASIS HOST ONCOMELANIA HUPENSIS ROBERTSONI: 
ANOTHER CASE OF ANCESTRAL POLYMORPHISM? 


Thomas Wilke*, George M. Davis, Dongchuan Qiu & Robert C. Spear


ABSTRACT 


Today, the human blood fluke, Schistosoma japonicum, is transmitted in China by two 
subspecies of the rissooidean snail taxon Oncomelania hupensis: O. h. hupensis and O. 
h. robertsoni. Whereas the eastern Chinese subspecies O. h. hupensis has been studied 
extensively using mitochondrial DNA sequences, very few data exist for the western
subspecies O. h. robertsoni. Preliminary phylogeographic studies indicate that the latter shows
a very high degree of genetic diversity, with Kimura 2-parameter distances in the
cytochrome oxidase I (COI) gene of up to 0.0932 (= 9.32%) among four sequences previously
deposited in GenBank. Extreme degrees of intraspecific heterogeneity in gastropods have 
been reported before, and possible explanations include the presence of cryptic species 
complexes, isolation followed by secondary contact, heteroplasmy and duplications within 
the mitochondrial genome, the presence of “pseudogenes”, and the retention of ancestral 
mitochondrial polymorphism. 

Given the great significance of understanding phylogeographic patterns in the interme- 
diate schistosomiasis host Oncomelania h. robertsoni for comprehending host/parasite 
relationships, DNA sequences of two mitochondrial genes (COI and LSU rRNA) from 66 
O. hupensis robertsoni specimens are used to (1) assess the phylogenetic position, (2) 
study the degree of heterogeneity within and between “populations”, (3) provide a prelimi- 
nary overview of the geographic distribution of major genetic groups and (4) study the 
phylogenetic concordance of the two gene fragments. 

Phylogenetic analyses, parametric bootstrapping and studies of sequence polymorphism 
show that: (1) all COI sequences are fully protein-coding with no insertions or deletions,
(2) both individual and combined analyses of the COI and LSU rRNA genes show at least 
four distinct haplotype groups within O. h. robertsoni, (3) monophyly of the four clades 
cannot be confirmed, (4) there is high concordance in cluster patterns and arrangement of 
individual haplotypes of both gene fragments, (5) two of the genetic clades recovered 
appear to be localized, whereas the other two are widely distributed, and (6) sympatry of 
individuals belonging to different clades occurs. Moreover, based on preliminary AFLP 
analyses it could be shown that (7) there is no phylogenetic concordance between the 
mitochondrial and nuclear data presented here, and (8) the nuclear data from AFLP 
genotyping indicate a lack of clear population structure. 

Given the results of the present study, it is cautiously suggested that retention of ances- 
tral mitochondrial DNA polymorphism possibly in combination with some effects of sec- 
ondary contact (introgression) is the most probable explanation for the occurrence of deviant 
lineages in O. h. robertsoni. On the basis of nuclear, morphological, and ecological data, it 
is also suggested that there is no evidence of organismal subdivision in O. h. robertsoni. It 
is strongly recommended that future studies incorporate more data from nuclear loci in 
order to better understand phylogeography, population genetics, and host-parasite co- 
evolution in O. h. robertsoni. 
INTRODUCTION


The human blood fluke, Schistosoma japoni- 
cum, responsible for one of the most serious 
disease problems in China, schistosomiasis, 
uses small dioecious rissooidean gastropods 
of the species Oncomelania hupensis as in- 
termediate hosts. Molecular and morphologi- 
cal analyses, together with breeding 
experiments and biogeographic studies of O. 
hupensis, indicate that there are three subspe- 
cies on the mainland of China (Davis, 1992; 
Davis et al., 1995, 1999). Oncomelania h. 
robertsoni is restricted to high elevations on 
the plateaus and mountains of Yunnan and 
Sichuan above the Three Gorges. Oncomela- 
nia h. hupensis is found throughout the Yangtze 
River drainage below the Three Gorges; it has 
spread to Guangxi Province probably via the 
Grand Canal from Hunan. Oncomelania h. 
tangi is restricted to Fujian Province along the 
coast. The latter subspecies has been eradi- 
cated except for two known populations, and 
the parasite presumably is extinct. 

Of the two widespread Chinese subspecies
O. h. hupensis and O. h. robertsoni, the former 
has received considerable attention in genetic 
studies (allozymes and mitochondrial gene 
sequences) dealing with questions of popula- 
tion structure, phylogeography, infectivity and 
the nature of shell ribbing (e.g., Davis et al., 
1995; Wilke et al., 2000a; Shi et al., 2002). 

In contrast, very little is known about the ge- 
netics of the western Chinese subspecies O. 
h. robertsoni. In fact, whereas as of June 2005, 
140 nucleotide sequences are available for O. 
h. hupensis from GenBank, only ten sequences 
(from a total of four specimens) exist for O. h. 
robertsoni. However, preliminary studies in a 
phylogeographic framework of other O. 
hupensis subspecies indicated a very high 
degree of genetic diversity within the few mito- 
chondrial sequences available for O. h. 
robertsoni. In fact, of the four sequences pub- 
lished for the mitochondrial cytochrome c oxi- 
dase subunit I (COI) gene (GenBank accession 
numbers AF213339, AF253075, AF253076, 
AF531547), two sequences (AF253075 and 
AF213339) differ by a K2P (Kimura 2-parameter)
distance of 0.0932. For comparison, the
highest pairwise K2P distance among more
than 100 COI sequences for the eastern Chinese
subspecies O. h. hupensis (which is regarded
as genetically highly diverse) is 0.0340
(GenBank accession numbers AF254484 and
AF254509), only about 36% of the value found
in O. h. robertsoni. Moreover, in many
phylogenetic studies of rissooidean gastropods,
K2P distances in the COI gene comparable
to those found in O. h. robertsoni typically
reflect species-level, if not genus-level,
relationships (e.g., Wilke et al., 2000b; Wilke, 2003).
To complicate matters, in further studies involving
a single population of O. h. robertsoni from
the lower Anning River Valley in Sichuan (site
A8, see below), we even found pairwise K2P
divergences of up to 0.1027 within the site.
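
To make the K2P values quoted above more concrete, the following is a minimal sketch of how a pairwise Kimura 2-parameter distance can be computed from two aligned sequences. The function name `k2p_distance` and the toy sequences are illustrative assumptions; they are not data or software from this study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy alignment: one transition and one transversion in 10 sites
toy_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
```

The correction always yields a value at least as large as the uncorrected proportion of differing sites, because it accounts for multiple substitutions at the same site; for small divergences the two are nearly identical.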

Extreme degrees of intraspecific mitochon- 
drial heterogeneity in gastropods have been 
reported before, and potential explanations 
involve, among others, the presence of cryptic 
species complexes, isolation followed by sec- 
ondary contact, heteroplasmy and duplications 
within the mitochondrial genome, the presence 
of nuclear “pseudogenes”, or the retention of 
ancestral mitochondrial polymorphism. 

In order to shed light on the problem of het- 
erogeneity within Oncomelania h. robertsoni, 
we here use mitochondrial DNA (mtDNA) se- 
quences from a larger data set of 66 specimens 
from 13 sites. In addition to the protein-coding 
COI gene, we study the mitochondrial gene for 
large subunit ribosomal RNA (LSU rRNA) to 
test for potential conflicts between these gene 
fragments that could help to reveal method- 
ological problems. 

The specific goals of this paper are: 

(1) to assess the phylogenetic position of On- 
comelania h. robertsoni within the frame- 
work of other O. hupensis ssp., 

(2) to study the degree of mitochondrial het- 
erogeneity within and between “popula- 
tions” of O. h. robertsoni, 

(3) to provide a preliminary overview of the 
geographic distribution of major mtDNA 
groups within O. h. robertsoni, and 

(4) to study the phylogenetic concordance of 
different mitochondrial gene fragments. 

We also use the results of preliminary AFLP 
(amplified fragment length polymorphism)
genotyping of highly variable nuclear loci from 
a subset of 24 specimens to discuss the high 
degree of mtDNA diversity in the light of 
nuclear data (for a review of the performance 
of AFLP data in animal population genetics see 
Bensch & Åkesson, 2005).


MATERIALS AND METHODS 
Specimens Studied 
The current study includes 66 specimens of
Oncomelania hupensis robertsoni Bartsch, 1946,
from 13 sites in Yunnan and Sichuan
provinces, China (Table 1, Appendix).
TABLE 1: Locality information for Chinese specimens of Oncomelania hupensis robertsoni studied
(M = Meishan Area, A = Anning River Valley, Y = Yunnan).


Locality  Original     Latitude      Longitude        Locality                                                           No. specimens
code      locality #                                                                                                     studied
M1        D 98.16      29.99163 °N   103.41580 °E     Sichuan, Danling County, Ernming Township, Xiaoqiao Village        2
M2        D 98.14      30.13993 °N   103.61157 °E     Sichuan, Dongpo County, Panao Township, Magau Group 2 Village      10
M3        MG 96.18     30.0373 °N    103.9002 °E      Sichuan, Meishan County, Fusheng Township, Zhongfu Village         5
M4        -            30.067 °N     104.136 °E       Sichuan, Mianzhu                                                   1*
A1        D 98.04      27.93525 °N   102.20540 °E     Sichuan, Xichang County, Xixiang Township, Gucheng Village         4
A2        D 98.03      27.9318 °N    102.1962 °E      Sichuan, Xichang County, Xixiang Township, Gucheng Village         4
A3        D 98.12      27.87505 °N   102.30867 °E     Sichuan, Xichang County, Chaunxing Township, Minhe group 2 Village 4
A4        D 98.09      27.8000 °N    102.204 °E       Sichuan, Xichang County, Jingjiu Township, Zhoutun Village         4
A5        D 98.07      27.7995 °N    102.3087 °E      Sichuan, Xichang County, Hainan Township, Gucheng group 2 Village  4
A6        D 98.05      27.7973 °N    [illegible] °E   Sichuan, Xichang County, Hainan Township, Gucheng group 5 Village  4
A7        D 98.11      27.7468 °N    102.1903 °E      Sichuan, Xichang County, Jingjiu Township, Jingjiu Village         4
A8        Xi Chang**   26.9637 °N    102.1328 °E      Sichuan, Miyi County, Panlian Township, Shuanggou Village          12
Y1        Dali**       25.4510 °N    100.2007 °E      Yunnan, Dali City, Da Jin Ping, Zi Ran Village                     8


* from GenBank (Attwood et al., 2003) 


** previously studied using allozyme electrophoresis by Davis et al. (1995) 


A yet undescribed representative of the genus
Tricula (Tricula sp.; Davis et al., 1998) (GenBank
AF213341, AF212895) served as the primary
outgroup taxon and was used to root the
mtDNA trees. Like Oncomelania,
Tricula belongs to the family Pomatiopsidae. 
Additional outgroup taxa used in the current 
study are Oncomelania minima Bartsch, 1936 
(GenBank DQ212795, DQ212858), as well as 
four other subspecies of O. hupensis: O. h. 
hupensis (Gredler, 1881) (GenBank AF254547, 
DQ212859), O. h. tangi (Bartsch, 1936) (Gen- 
Bank DQ212796, DQ212860), O. h. formosana 
(Pilsbry & Hirasé, 1905) (GenBank DQ112283, 
DQ212861), and O. h. quadrasi (Moellendorff, 
1895) (GenBank DQ112287, DQ212862). 


DNA Isolation and Sequencing 


The method used for isolating DNA from snails
was modified from that of Spolsky et al. (1996).
Individual alcohol-preserved specimens were
first soaked for 10 min in 1 ml ice-cold exchange
buffer (0.02 M Tris base, 0.1 M EDTA, pH 8.0).
Then, either the soft body of a whole specimen
or part of the foot (depending on the size of the
specimen) was cut into pieces and incubated
overnight in a water bath at 58°C in 200 µl
Turner lysis buffer (0.02 M Tris base, 0.1 M
EDTA, 0.5% Sarkosyl, pH 8.0) and 3 µl of 20
µg/µl Proteinase K. After digestion, 35 µl of 5 M
NaCl and 35 µl of a 5% CTAB/0.5 M NaCl solution
were added. Extraction was carried out with
270 µl chloroform. After centrifugation for 5 min
at 9,000 rpm, the aqueous phase was transferred
into a new tube and 270 µl of CTAB precipitation
buffer (1% CTAB, 0.05 M Tris base,
0.01 M EDTA) was added, mixed and placed
at room temperature for 45 min. After pelleting
the CTAB-DNA for 10 min at 12,000 rpm, the
supernatant was discarded and the pellet redissolved
in 100 µl of NaCl/TE (0.01 M Tris base,
0.001 M EDTA, 1 M NaCl, pH 8.0) and 1 µl of
10 mg/ml RNase. After incubation for 8 min at
65°C, the DNA was precipitated overnight at
-20°C by adding 250 µl of ice-cold 96% ethanol.
After centrifugation for 15 min at 12,000
rpm, the pellet was washed twice with 300 µl of
ice-cold 70% ethanol, air-dried for 5-10 min and
finally redissolved in approximately 50 µl H2O.
Quality and quantity of the isolated genomic
DNA were checked on a 1% agarose gel.

The primers used to amplify a fragment of 
the COI gene with a target length of 658 base 
pairs (excluding 51 bp primer sequence) were 
LCO1490 and HCO2198 as described by 
Folmer et al. (1994). The primers for amplification
of a LSU rRNA fragment with a target length
of 505-508 bp (excluding 42 bp primer se- 
quence) were 16Sar-L and 16Sbr-H of Palumbi 
et al. (1991). Sequences (forward and reverse) 
were determined using the LI-COR (Lincoln, 
NE) DNA sequencer Long ReadIR 4200 and 
the Thermo Sequenase Fluorescent Labeled 
Primer Cycle Sequencing kit (Amersham 
Pharmacia Biotech, Piscataway, NJ). 

The COI sequences were aligned unambiguously
by eye using BioEdit 5.0.9 (Hall, 1999).
All sequences are fully protein-coding with no
insertions or deletions. However, the first few
base pairs (bp) behind the 3' end of each
primer were difficult to read. We therefore uniformly
cut off the first and last ten bp of each
sequence, leaving a 638 bp-long completely
overlapping fragment for the COI gene. Alignment
of LSU rRNA sequences was done using
ClustalX (version 1.81; Thompson et al., 1997).
No manual refinement was necessary as the
alignment yielded only five gaps: three single-nucleotide
gaps as well as one gap of up to
two nucleotides and one gap of up to three
nucleotides within a stretch of thymine bases.
The total length of the aligned LSU rRNA is 510
bp. All sequences are available from GenBank
(for GenBank accession numbers and DNA
voucher numbers see the Appendix).


AFLP Genotyping


Genomic DNA was digested with the frequent
cutter restriction enzyme MseI (New
England Biolabs, NEB) and the rare cutter
EcoRI (NEB). Adapters (Table 2) were ligated
to the genomic DNA using T4 ligase (NEB).
Both digestion and ligation were carried out in
a single reaction running for 12 h at 37°C.

The ligation product was used to perform a
pre-selective PCR amplification with NEB Taq
polymerase (for EcoRI and MseI primers see
Table 2). The quality of the ligation/pre-amplification
was checked on a 1% agarose gel.


TABLE 2: AFLP primers.


Primer                             Sequence

Adapters
  EcoRI                            5'-CTC GTA GAC TGC GTA CC-CAT CTG ACG CAT GGT TAA-3'
  MseI                             5'-GAC GAT GAG TCC TGA G-TA CTC AGG ACT CAT-3'

Pre-amplification primers
  E01 E-A (EcoRI)                  5'-GAC TGC GTA CCA ATT CA-3'
  M02 M-C (MseI)                   5'-GAT GAG TCC TGA GTA AA-3'

Selective amplification primers
  700 E-AAC                        5'-IRD700-GAC TGC GTA CCA ATT CAA C-3'
  800 E-AAG                        5'-IRD800-GAC TGC GTA CCA ATT CAA G-3'
  M-CGA                            5'-GAT GAG TCC TGA GTA AGG A-3'
  M-CTT                            5'-GAT GAG TCC TGA GTA ACT T-3'
  M-CTC                            5'-GAT GAG TCC TGA GTA ACT C-3'
  M-CAT                            5'-GAT GAG TCC TGA GTA ACA T-3'
  M-CTA                            5'-GAT GAG TCC TGA GTA ACT A-3'
  M-CTG                            5'-GAT GAG TCC TGA GTA ACT G-3'


Selective amplification was performed from
1:40 diluted pre-amp DNA as duplex PCR (one
unlabeled MseI primer each with the two IRDye-labeled
EcoRI primers; Table 2). A total of 12
primer combinations was used for the PCR.
Labeled PCR products were separated on an
8% acrylamide gel using the DNA sequencer
LI-COR Long ReadIR 4200 and digitally captured
with the software package SAGA Generation
2 (MX module version 3.2.1.) from LI-COR.
We manually selected the most informative and
consistent bands for analysis. These 102 polymorphic
loci were scored for 32 individuals of
O. hupensis. Samples with > 10% ambiguities
(more than ten ambiguous values for 102 loci
scored) were removed from the data set and
not included in any analyses, thus reducing the
sample size from 32 to 24 individuals.
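
The sample-exclusion rule described for the AFLP scores (more than 10% ambiguous values per individual) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the data layout (a dictionary of per-locus scores with `None` marking an ambiguous band) and the sample names are hypothetical and do not reflect how the scoring software stores its results.

```python
def filter_ambiguous_samples(scores, max_fraction=0.10):
    """Keep samples whose fraction of ambiguous locus scores is <= max_fraction.

    scores: dict mapping sample name -> list of locus scores, where
            1 = band present, 0 = band absent, None = ambiguous.
    """
    kept = {}
    for name, loci in scores.items():
        if not loci:
            continue  # nothing scored for this sample: drop it
        n_ambiguous = sum(1 for score in loci if score is None)
        if n_ambiguous / len(loci) <= max_fraction:
            kept[name] = loci
    return kept


# Hypothetical example with 102 loci per sample
n_loci = 102
example_scores = {
    "sample_ok": [None] * 10 + [1] * (n_loci - 10),    # 10 ambiguous: kept
    "sample_bad": [None] * 11 + [0] * (n_loci - 11),   # 11 ambiguous: removed
}
retained = filter_ambiguous_samples(example_scores)
```

With 102 loci, ten ambiguous values correspond to 9.8% and eleven to 10.8%, so the 10% threshold matches the "more than ten ambiguous values" wording of the text.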


Preliminary Statistical Analyses (mtDNA) 


Possible dissimilarities between the COI and 
LSU rRNA data sets are of primary interest for 
the current study. In order to test whether there 
were significant differences in incongruence- 
length between the COI and LSU rRNA data 
sets, the HOMPART command in PAUP* v. 
4.0b10 (Swofford, 2002) was used to perform 
a partition-homogeneity test (Farris et al., 
1995). As the test did not reveal a significant 
conflict (P = 0.2580; 10,000 replicates), the two 
data sets were used in a combined analysis. 

Given the potential high degree of sequence 
diversity in Oncomelania hupensis robertsoni, 
as indicated by preliminary analyses, we used 
the test of Xia et al. (2003) implemented in 
the software package DAMBE 4.2.13 (Xia & 
Xie, 2001) to test for saturation prior to the 
phylogenetic analyses. The Xia et al. test did
not reveal a significant degree of saturation:
the index of substitution saturation was significantly
lower than its critical value
(Iss = 0.301, Iss.c = 0.801, P = 0.0000).

Nucleotide diversities and divergences (cor- 
rected according to the K2P-parameter-model) 
were calculated using MEGA 2.1 (Kumar et 
al., 2000) with standard errors estimated by 
1,000 bootstrap replications with pairwise de- 
letion of gaps and missing data. 


Phylogenetic Reconstruction (mtDNA) 


The performance of different phylogenetic 
methods is highly controversial, and as numer- 
ous factors such as degree of heterogeneity 
and sample size may affect the quality of phy- 
logenetic reconstruction (e.g., Huelsenbeck, 
1995; Wiens & Servedio, 1998; Kolaczkowski 
& Thornton, 2004), we here use both maxi- 
mum parsimony (MP) and Bayesian inference 
(BI) based methods. 

Phylogenetic analyses based on the MP cri- 
terion were conducted in PAUP* 4.0b10 
(Swofford, 2002) using the heuristic search 
option with tree bisection reconnection branch-swapping,
100 replications of random stepwise
additions, and MAXTREES set to 10,000. 
Node support was evaluated with 10,000 
bootstrapping replications. 

Phylogenetic reconstruction based on BI was 
conducted using the software package 
MrBayes 3.0b4 (Huelsenbeck & Ronquist, 
2001). First, we compared several independent 
runs using the default random tree option to 
monitor the convergence of the -ln likelihoods
of the trees. The -log likelihoods started at
around -8,100 and converged on a stable value
of about -4,300 after approximately 60,000 generations.
We then did a final run using the
Metropolis-coupled Markov chain Monte Carlo
variant with four chains (one cold, three heated) 
and 1,000,000 sampled generations with the 
current tree saved at intervals of 10 genera- 
tions. A 50% majority rule tree was constructed 
from all sampled trees with the first 10,000 trees 
(100,000 generations) ignored as burn in. 
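
The relation between generations, sampling interval and burn-in stated above is simple arithmetic; a short sketch makes it explicit. The function name and return keys are hypothetical, and the calculation ignores the starting tree at generation 0, which some programs also write to the tree file.

```python
def mcmc_sampling_summary(generations, sample_interval, burnin_trees):
    """Summarise how many sampled trees enter the consensus after burn-in."""
    if sample_interval <= 0:
        raise ValueError("sample_interval must be positive")
    total_trees = generations // sample_interval
    if not 0 <= burnin_trees <= total_trees:
        raise ValueError("burn-in must lie between 0 and the number of sampled trees")
    return {
        "total_trees": total_trees,
        "burnin_generations": burnin_trees * sample_interval,
        "trees_in_consensus": total_trees - burnin_trees,
    }


# Values taken from the text: 1,000,000 generations, a tree saved every
# 10 generations, and the first 10,000 trees discarded as burn-in.
summary = mcmc_sampling_summary(1_000_000, 10, 10_000)

# The burn-in should extend past the point where the likelihoods stabilised
# (approximately 60,000 generations according to the text).
burnin_covers_convergence = summary["burnin_generations"] >= 60_000
```

Under these settings the consensus tree would be built from 90,000 sampled trees, and the discarded 100,000 generations comfortably cover the reported convergence point.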

MP and BI analyses were conducted with a simple
and an optimal model of sequence evolution,
respectively (the latter based on the Akaike
Information Criterion implemented in Modeltest 3.6;
Posada & Crandall, 1998).


Parametric Bootstrapping (mtDNA) 


A parametric bootstrapping approach was 
used to specifically test the monophyly of On- 
comelania h. robertsoni (for a review of the 
parametric bootstrap see Hillis et al., 1996). 
First we ran Modeltest to find the optimal 
model of sequence evolution for the aligned 
sequences of all O. h. robertsoni haplotypes. 
We then conducted maximum likelihood (ML) 
searches in PAUP* v. 4.0b10 under the constraint
that O. h. robertsoni is NOT monophyletic
(null hypothesis). The resulting tree was,
together with the aligned sequences, imported 
into Seq-Gen 1.2.5. (Rambaut & Grassly, 
1997) to generate 100 random data sets 
based on the model suggested by Modeltest. 
We then analyzed in PAUP the differences in 
tree lengths between the constrained and un- 
constrained trees for each of the 100 replicates. 
The frequency of differences in tree lengths 
was plotted and compared to the tree length 
difference (constrained vs. unconstrained) of 
the original unpermutated data set. Finally, we 
estimated how likely it was that this difference 
could have been observed randomly. 
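
The final comparison step of the parametric bootstrap, in which the observed tree-length difference is located within the distribution of simulated differences, can be sketched as below. The function name and the simulated values are hypothetical; the tree searches and sequence simulations themselves are not reproduced here. Both tail fractions are returned because formulations of the test differ in which tail is reported as P.

```python
def tail_fractions(simulated_diffs, observed_diff):
    """Locate an observed tree-length difference among simulated ones.

    Returns (fraction of simulated values <= observed,
             fraction of simulated values >= observed).
    """
    if not simulated_diffs:
        raise ValueError("at least one simulated difference is required")
    n = len(simulated_diffs)
    at_most = sum(1 for d in simulated_diffs if d <= observed_diff) / n
    at_least = sum(1 for d in simulated_diffs if d >= observed_diff) / n
    return at_most, at_least


# Hypothetical tree-length differences (constrained minus unconstrained)
# for 100 simulated data sets, and a hypothetical observed difference.
simulated = [1] * 41 + [5] * 59
observed = 2
frac_at_most, frac_at_least = tail_fractions(simulated, observed)
```

An observed difference that falls well inside the simulated distribution, as in this toy example, gives no grounds for rejecting the hypothesis under which the data were simulated.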


Intraspecific Genomic Polymorphism (AFLP) 


AFLP genotyping is used here in a first attempt 
to study the degree of nuclear polymorphism in 
O. h. robertsoni on the DNA fingerprint level.
However, given the limited number of specimens 
used for AFLP genotyping, we restrict our analy- 
ses to estimating diversity indices and to com- 
puting a minimum spanning network (MSN) 
among genotypes in a preliminary assessment 
of genetic structure in our data set. 

A matrix of corrected average pairwise dif- 
ferences between Oncomelania h. hupensis 
and O. h. robertsoni as well as within O. h. 
robertsoni was calculated in Arlequin 2.0 
(Schneider et al., 2000). The matrix was also 
used to construct the MSN for the AFLP 
haplotypes via Arlequin 2.0. 


RESULTS 
MtDNA Sequence Polymorphism 


Among the 66 specimens of Oncomelania
h. robertsoni studied, a total of 40 haplotypes
was found for the combined COI/LSU rRNA
fragments. The average nucleotide diversity
(corrected to the K2P-model) among all individuals
of O. h. robertsoni is 0.046 ± 0.004
with a pairwise maximum of 0.117 between
individuals A8d (Anning River Valley) and M2e
as well as M2g (both from Meishan Area).

Within Oncomelania h. robertsoni, we detected
four relatively distinct genetic groups
(characterized by average genetic divergences
of > 0.04 and numbered I, IIa, IIb, and IIc in Table
3 and Fig. 1). The divergences among these
groups range from 0.042 ± 0.006 (between
groups IIa and IIc) to 0.085 ± 0.010 (between
groups I and IIc). In comparison, the overall level
of genetic divergence among representatives of
other Oncomelania hupensis subspecies ranges
from 0.0097 ± 0.0029 (between O. h. hupensis
and O. h. formosana) to 0.1024 ± 0.0103 (between
O. h. formosana and O. h. quadrasi).

It should be noted that in phylogeographical 
studies, the evolutionary relationships above 
and below the species level are different in 
nature and their resolution requires a differ- 
ent set of methods (Posada & Crandall, 2001). 
Therefore, many workers use phylogeograph- 
ical tools (e.g., network, population structure 
and gene flow analyses) to infer within-spe- 
cies relationships. However, preliminary tests 
show that the diversity in our data set is too 
high for these analyses. Therefore, we have 
to restrict the following mtDNA analyses to 
standard phylogenetic tests (MP and BI phy- 
logenetic reconstruction as well as paramet- 
ric bootstrapping). 


MtDNA Phylogenetic Analyses 


Given the great significance of data set congruence
for addressing potential problems of
heteroplasmy and NUMTs, we also performed
and compared separate phylogenetic analyses
with the individual COI and LSU rRNA data
sets, despite the fact that the partition-homogeneity
test did not reveal significant conflicts.
Both MP and BI analyses revealed four distinct
phylogenetic groups of O. hupensis
robertsoni in the COI and LSU rRNA phylogenies.
A manual comparison of the two trees
showed a high congruence between the cluster
patterns in the COI and LSU rRNA trees,
that is, in both gene trees, the same specimens
clustered in the same groups (individual
trees not shown here). However, there were
differences in the trees relative to the monophyly
of the four groups within O. hupensis
robertsoni. Whereas MP and BI analyses of
the COI data set resulted in trees that showed
the four major groups of O. h. robertsoni to be
monophyletic, in the LSU rRNA data set the
four groups were either paraphyletic (BI analysis:
clade I clustered together with the other
four subspecies of O. hupensis) or unresolved
(MP analysis).


TABLE 3: Average K2P nucleotide divergences between four major genetic
groups of Oncomelania h. robertsoni (below diagonal line) and
average nucleotide diversities within major groups (diagonal line). For a
geographic distribution of these groups, see Fig. 1.


       I                IIa              IIb              IIc
I      0.018 ± 0.003
IIa    0.089 ± 0.009    0.011 ± 0.002
IIb    0.085 ± 0.009    0.043 ± 0.006    0.005 ± 0.001
IIc    0.085 ± 0.010    0.042 ± 0.006    0.043 ± 0.007    0.010 ± 0.003

We then combined the two data sets and performed
several phylogenetic analyses using MP
and BI. In all combined analyses, we could recover
the four distinct clades of O. h. robertsoni,
generally with good support values. However,
neither MP nor BI could fully resolve the relationships
among these clades (similar to the MP
analysis of the LSU rRNA data set; see above):
there is a trichotomy of (1) the clade comprising
the four other subspecies of O. hupensis used
in the present study, (2) clade I of O. h. robertsoni,
and (3) a clade composed of sub-clades IIa, IIb,
and IIc of O. h. robertsoni (Fig. 1).

The most basal clade in O. h. robertsoni
(clade I) has a wide geographic distribution.
Haplotypes belonging to this clade were found
in all geographic areas sampled in the present
study, that is, in Yunnan, in the southern Anning
River Valley, and in eastern Meishan Area. In
contrast, based on the limited data presented
here, clade IIa appears to be a localized clade
with haplotypes coming exclusively from localities
in the northern Anning River Valley. Clade
IIb has a wider distribution, ranging from the
southern Anning River Valley to a single locality
in Meishan Area. Finally, clade IIc is a localized
clade restricted to Meishan Area.

Sympatric specimens belonging to different
clades were found in two localities. At site A8
(southern Anning River Valley), nine of the 12
specimens studied belong to clade I and three
to clade IIb. At site M2 (central Meishan
Area), three of the ten specimens studied belong
to clade IIb and seven to clade IIc (Fig. 1).


Parametric Bootstrapping 


Given the inability to resolve the question of
Oncomelania h. robertsoni monophyly using
the phylogenetic methods above, a parametric
bootstrapping test was performed.

The null hypothesis of non-monophyly
cannot be rejected (P = 0.41) as the observed
difference in tree lengths between the constrained
and unconstrained tree in the original
data set is smaller than the corresponding difference
in 59% of the simulated data sets (Fig. 2). In
other words, a tree that has been forced to show
O. h. robertsoni as non-monophyletic is not significantly
worse than an unconstrained tree.
FIG. 2. Result of the parametric bootstrap analysis for the hypothesis of non-monophyly
of Oncomelania h. robertsoni. The black bars show the number of simulated
data sets and the corresponding differences in tree lengths between the constrained
(non-monophyly criterion) and unconstrained trees. The dashed line shows the observed
difference for the original data set. 41 of the 100 sampled data sets have tree
length differences equal to or smaller than that of the original data set. Therefore, the null
hypothesis of non-monophyly of O. h. robertsoni cannot be rejected.




AFLP Analysis 


All 24 specimens analyzed had unique AFLP
fingerprints. Estimates of diversity indices (average
pairwise differences among all O. h.
robertsoni genotypes divided by the total number
of loci scored as well as the average
pairwise differences between O. h. hupensis
and O. h. robertsoni divided by the total number
of loci) resulted in a within O. h. robertsoni
diversity of 0.113 ± 0.045 and in a divergence
between the two respective subspecies of
0.214 ± 0.034.
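
The two diversity indices described above (average pairwise differences divided by the number of loci scored) can be illustrated with a small sketch. It assumes complete binary band profiles without missing values and computes raw averages only; any correction applied by the population genetics software used in the study is not reproduced. All function names and genotypes are hypothetical.

```python
from itertools import combinations


def pairwise_differences(g1, g2):
    """Number of loci at which two band profiles differ."""
    if len(g1) != len(g2):
        raise ValueError("genotypes must be scored for the same loci")
    return sum(1 for a, b in zip(g1, g2) if a != b)


def within_group_diversity(genotypes):
    """Mean pairwise differences among genotypes, divided by the number of loci."""
    pairs = list(combinations(genotypes, 2))
    if not pairs:
        raise ValueError("at least two genotypes are required")
    mean_diff = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    return mean_diff / len(genotypes[0])


def between_group_divergence(group_a, group_b):
    """Mean differences over all cross-group pairs, divided by the number of loci."""
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one genotype")
    total = sum(pairwise_differences(a, b) for a in group_a for b in group_b)
    return total / (len(group_a) * len(group_b)) / len(group_a[0])


# Hypothetical band profiles scored at four loci (1 = present, 0 = absent)
group_x = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
group_y = [[1, 1, 1, 1]]
diversity_x = within_group_diversity(group_x)
divergence_xy = between_group_divergence(group_x, group_y)
```

Dividing by the number of loci puts both indices on a 0 to 1 scale, which is what allows the within-subspecies and between-subspecies values reported in the text to be compared directly.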

The MSN (Fig. 3) shows most O. h.
robertsoni genotypes clustering in a star-like 
pattern. The single O. h. hupensis genotype 
scored is distinct from the O. h. robertsoni 
genotypes. Within O. h. robertsoni, some 
genotypes (e.g., Y1f, M2g, A8g, M2b, and A2b) 
are relatively distinct as well. Given the star- 
like nature of the network, no clear population 
structure is recognizable and specimens from 
the same site do not cluster together in dis- 
tinct groups. Also, the four major mtDNA clades 
found in the present study (Fig. 1) are not re- 
flected in the AFLP data. 


FIG. 3. Minimum spanning network for observed AFLP genotypes of Oncomelania h. robertsoni (large
white circles) based on 102 polymorphic loci. For comparison, a specimen of O. h. hupensis (large
black circle) was included. Small black circles indicate the scored differences between the haplotypes.
For individual codes see the Appendix.


DISCUSSION


Given the great significance of phylogeographic
patterns in the intermediate schistosomiasis
host Oncomelania h. robertsoni for
understanding host/parasite relationships,
there are several interesting findings in our
study that can potentially help to shed some
light on our observation of high rates of mtDNA
sequence divergence within Oncomelania h.
robertsoni:

(1) all COI sequences are fully protein-coding
with no insertions or deletions;

(2) both individual and combined analyses of
the mtDNA COI and LSU rRNA genes
show four distinct haplotype groups within
the subspecies of interest (note that given
our still preliminary sampling design, it is
quite possible that more haplotype groups
will be recovered in future studies);

(3) neither the phylogenetic analyses nor the
parametric bootstrapping test performed
here are conclusive relative to the monophyly
of the four O. h. robertsoni clades found;

(4) both the partition-homogeneity test and visual
inspections of the individual COI and
LSU rRNA trees revealed high concordance
in cluster patterns and arrangement
of individual mtDNA haplotypes;

(5) two of the mtDNA clades recovered appear
to be localized, whereas two are
widely distributed;

(6) sympatry of individuals belonging to different
mtDNA clades does occur;

(7) there is no phylogenetic concordance between
the mitochondrial and preliminary
nuclear data presented here; and

(8) the nuclear data from AFLP genotyping indicate
a lack of clear population structure
in O. h. robertsoni.

Based on these results, we will focus in our 
discussion on the high rates of intraspecific 
mtDNA variability and discuss some of the 
explanations found in the literature and their 
relevance for the O. h. robertsoni problem. 


Presence of a Cryptic Species Complex 


The presence of cryptic species radiations 
has been reported for many mollusc groups, 
though morphostasis seems to be particularly 
common in rissooidean gastropods (e.g., Pon- 
der et al., 1995; Hershler et al., 1999; Wilke & 
Pfenninger, 2002). In fact, several studies on 
snail hosts in SE Asia revealed cryptic radia- 
tions in the family Pomatiopsidae (e.g., Davis, 
1992; Attwood & Johnston, 2001). However,
the taxon of concern in the present paper, 
Oncomelania hupensis, is one of the morpho- 
logically and ecologically best studied snail 
taxa in Southeast Asia. Particularly the three 
Chinese subspecies (O. h. hupensis, O. h. 
robertsoni, and O. h. tangi) were subject to 
extensive shell morphological and quantitative 
anatomical studies and comparative anatomi- 
cal analyses did not reveal significant differ- 
ences within these subspecies. In fact, a 
comparative anatomical study of O. h. robert- 
soni populations from Yunnan and Sichuan 
provinces showed that they are anatomically 
indistinguishable (George Davis, unpublished
data). Moreover, an allozyme study (Davis et 
al., 1995) of the same three populations of O. 
h. robertsoni from Sichuan and Yunnan prov- 
inces showed low levels of heterogeneity 
within and between populations that are not 
indicative of a marked departure from the other 
subspecies. In fact, the allozyme heterogene- 
ity within O. h. robertsoni was lower than in 
the eastern Chinese subspecies O. h. hupen- 
sis. This lack of population structure within O. 
h. robertsoni could be confirmed in our AFLP 
study. Given these findings, the presence of a
cryptic taxon complex within O. h. robertsoni
can very likely be ruled out as a cause for the 
high degree of mtDNA diversity within this sub- 
species. 


Duplications within the Mitochondrial Genome 


Duplications of genes or gene fragments 
within the mitogenome involving protein-cod- 
ing genes are most often explained with the 
mechanism of tandem duplication of gene re- 
gions as a result of slipped strand mispairing, 
followed by the deletions of genes (Inoue et 
al., 2003, and references therein). Most dupli- 
cations involve short fragments where control 
regions and tRNA genes seem to be particu-
larly prone to mispairing, but there are also
reported cases of duplicated portions > 8 kbp
(e.g., Moritz & Brown, 1987; Inoue et al., 2003).

If duplications of mtDNA genes were respon- 
sible for the observed mtDNA patterns in O. 
h. robertsoni, then this would involve a large 
portion of the mtDNA genome containing both 
COI and LSU rRNA genes. While this does 
not seem to be impossible (see above), it is 
not very likely as this explanation requires a 
high number of assumptions. 


Presence of Nuclear Mitochondrial DNA 
(NUMT or “Pseudogenes”)


Nuclear copies of mitochondrial genes, so- 
called nuclear mitochondrial DNA (NUMT) or 
“pseudogenes” (e.g., Lopez et al., 1994; Ben- 
sasson et al., 2001) have been observed in 
many animal species and, if unnoticed, can
severely confound phylogenetic and popula- 
tion genetic studies (Zhang & Hewitt, 1996). 
According to Bensasson et al. (2001), symp- 
toms of NUMT contamination of mtDNA can 
include: (A) PCR ghost bands, (B) sequence 
ambiguities (e.g., if encountered in forward and 
reverse strands), (C) frame shift mutations, 
and (D) stop codons. None of these symptoms 
were observed in the sequence data gener-
ated for the present study; i.e., there were no
ghost bands in the PCR products, there were 
no relevant alignment conflicts in forward and 
reverse strands, there were no insertions or 
deletions in the alignment of the protein-cod- 
ing COI gene, and the gene portion studied 
was free of stop codons. Moreover, as the in- 
dividual COI and LSU rRNA phylogenies are 
concordant, both genes would have had to 
move simultaneously into the nuclear genome. 
Given all these facts, we can rule out the pres- 
ence of NUMTs in our data sets. 
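
Symptoms (C) and (D) listed above lend themselves to a simple computational screen. The following Python sketch is offered only as an illustration and is not the procedure used in this study. It assumes that sequences are available as plain strings with "-" marking alignment gaps, that the reading frame is known, and that the invertebrate mitochondrial genetic code applies (in which only TAA and TAG act as stop codons). The function and parameter names are hypothetical.

```python
import re

# Assumption: invertebrate mitochondrial genetic code (NCBI translation
# table 5), where TGA codes for Trp and AGA/AGG for Ser, so that only
# TAA and TAG terminate translation.
INVERTEBRATE_MT_STOPS = {"TAA", "TAG"}


def find_stop_codons(sequence, frame=0):
    """Return 0-based codon indices of in-frame stop codons.

    Gap characters ('-') are removed first. `frame` is the offset of the
    first complete codon (hypothetical parameter; 0 if the fragment
    starts at codon position 1).
    """
    ungapped = sequence.upper().replace("-", "")
    stops = []
    for codon_index, start in enumerate(range(frame, len(ungapped) - 2, 3)):
        if ungapped[start:start + 3] in INVERTEBRATE_MT_STOPS:
            stops.append(codon_index)
    return stops


def has_frameshift_gap(aligned_sequence):
    """Return True if any internal gap run is not a multiple of three."""
    core = aligned_sequence.strip("-")  # ignore leading/trailing padding
    return any(len(run.group()) % 3 != 0 for run in re.finditer(r"-+", core))


if __name__ == "__main__":
    # Toy fragments, not data from this study.
    print(find_stop_codons("ATGTGAAGATTT"))
    print(has_frameshift_gap("ATG-TTTAAA"))
```

A sequence that passes both checks is not thereby proven to be mitochondrial in origin; the screen only flags the two symptoms it was written for.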


Heteroplasmy


Heteroplasmy, that is, the presence of more 
than one type of mtDNA in males/females or
even in the same organism, has been reported 
from several invertebrate species (e.g., Zouros
et al., 1992; Fujino et al., 1995; van Herwerden
et al., 2000; Steel et al., 2000). 

The mitochondrial genome is usually inher- 
ited maternally, but paternal ‘leakage’ and/or 
biparental inheritance patterns are common in 
some groups (e.g., Mytilus; Hoeh et al., 1991). 
In addition to biparental inheritance, animal 
mitochondrial heteroplasmy can also be 
caused by mutation of the genome within the 
individual or within the original oocyte (Steel et 
al., 2000). In most cases, heteroplasmy only 
involves variations in the number of repeats
within the mitochondrial control region (e.g., 
Hoarau et al., 2002) but base substitutions in 
coding genes also have been found (e.g., van 
Herwerden et al., 2000; Steel et al., 2000). 
While paternal or biparental inheritance cannot
be completely dismissed as a cause for the mi-
tochondrial diversity observed in O. h. robert-
soni, it is not likely as, for example, all 28
specimens studied from sites A1–7 belong to
the same clade. Another possible scenario 
would be the existence of distinct populations 
of mitochondria due to non-concerted evolu- 
tion. Fujino et al. (1995) and van Herwerden 
et al. (2000) found heteroplasmy in the COI 
and ND1 genes, respectively, of several dige- 
netic trematode species. The workers sug- 
gested that structurally different forms of 
mitochondria are present in the tegumental and 
parenchymal cells of adults. Given the life style 
of the amphibious Oncomelania h. robertsoni, 
that is, the ability to closely shut the shell with 
its operculum to avoid dehydration during dry 
conditions and the fact that some freshwater 
snails (including snail hosts for schistosomia- 
sis) have been shown to be capable of switch- 
ing between aerobic and anaerobic respiration 
(Jurberg et al., 1997; van Hellemond et al., 
1995, 2003), structurally different types of mi- 
tochondria associated with different metabolic 
respiratory processes could, at least in theory, 
exist in Oncomelania h. robertsoni. However, 
as the full phylogenetic concordance of COI
and LSU rDNA haplotypes (based on the com-
parison of the individual COI and LSU rDNA
trees) does not support the existence of more 
than one type of mitochondrion in a single in- 
dividual, non-concerted evolution appears to 
be extremely unlikely as well. 


Temporal Isolation Followed by Secondary 
Contact 


An increasing number of studies shows that 
temporal isolation followed by secondary con- 
tact has deeply influenced the phylogeography 
of many Palearctic species (e.g., Taberlet et
al., 1998; Hewitt, 2000). Particularly, pro- 
cesses resulting from fragmentation into gla- 
cial refuges followed by range expansions via 
postglacial colonization routes may lead to 
secondary contact zones among formerly dis- 
jointed lineages (e.g., Pfenninger & Posada, 
2002). Pleistocene glaciations and climate 
changes certainly must have affected the riv- 
ers and streams of the area that is currently 
populated by O. h. robertsoni. However, we 
doubt that these phylogeographic processes 
alone are responsible for the extant mtDNA 
patterns seen today. The divergence between
the major clades of O. h. robertsoni, with K2P
differences of up to 8.5% for the combined
COI/LSU rDNA data set, is indicative of much
older divergence times than late Pleistocene
or Holocene. Wilke (2003) suggested an av-
erage COI local clock rate of 1.83 ± 0.21%
uncorrected distance/my for Protostomia lin-
eages that are not affected by saturation.
Given an uncorrected average pairwise COI 
distance of 8.7% between clades I and II in
our analysis (Fig. 1), the oldest split in O. h. 
robertsoni is potentially some 4 my old (i.e., 
early Pliocene) and predates the split of all 
other O. hupensis subspecies. Secondary con-
tact of formerly isolated populations may there-
fore not fully explain the patterns observed
here, particularly as there is no compelling 
supporting evidence from our AFLP data or 
previous allozyme studies conducted by Davis 
et al. (1995). 
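
The distance and age arguments in this paragraph can be retraced with a short calculation. The Python sketch below is an illustration under stated assumptions, not the software used for the analyses: it computes the uncorrected (p) distance and the Kimura 2-parameter (K2P) distance for two aligned sequences, and converts a distance into an age by dividing it by a clock rate. It assumes that the rate of 1.83 ± 0.21% per my is expressed as pairwise divergence, and all function names are hypothetical. With a distance of 8.7% and rates between 1.62 and 2.04% per my, the resulting ages fall between roughly 4.3 and 5.4 my, which is of the same order as the estimate given above.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (compared sites, transitions, transversions).

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites."""
    sites, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / sites


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance: -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    sites, ts, tv = count_differences(seq1, seq2)
    p, q = ts / sites, tv / sites
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


def divergence_time_my(distance_percent, rate_percent_per_my):
    """Age in my, assuming the rate is pairwise divergence per my."""
    return distance_percent / rate_percent_per_my


if __name__ == "__main__":
    # Values quoted in the text; the rate bounds are 1.83 -/+ 0.21.
    for rate in (2.04, 1.83, 1.62):
        print(rate, round(divergence_time_my(8.7, rate), 2))
```

Such a point estimate inherits all uncertainty of the clock rate and ignores ancestral polymorphism, so it is best read as an order of magnitude.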


Retention of Ancestral mtDNA Polymorphism 


The conflict between our mtDNA and nuclear
data sets combined with the potentially long 
age of the O. h. robertsoni clades, as dis- 
cussed above, may be indicative of a problem 
in some mtDNA analyses: retained ancestral 
polymorphism. 

A mtDNA phylogeny represents a gene tree 
that may not be congruent with the species tree 
(i.e., no reciprocal monophyly in the descen- 
dant taxa) because of the retention of ances- 
tral lineages due to stochastic processes (e.g., 
Avise, 2000; Moore, 1995). This is particularly 
true for species with ancient divergences 
(Avise, 2000) and the problem cannot be
solved using multiple mtDNA genes, as the 
animal mitochondrial genome is inherited as a 
single unit. Therefore, phylogenies derived 
from multiple mtDNA genes are not indepen- 
dent estimates of a species’ phylogeny (Moore, 
1995; Page, 2000). 

Long-term substantial isolation among popu- 
lations of O. h. robertsoni could have disrupted 
gene flow and therefore allowed the retention 
of anciently separated matrilines. As pointed 
out by Avise (2000), the evolutionary continu- 
ance of isolated populations may buffer 
against the extinction of lineages within a spe- 
cies. However, this would not explain the oc- 
currence of different matrilines in sympatry as 
seen in sites M2 and A8 (Fig. 1). Perhaps there 
is secondary contact among these lines after 
all (i.e., introgression), either due to post-Pleis- 
tocene range expansions or human impact 
(like transport of snails or their eggs with rice 
plants). However, as our AFLP data (and pre- 
vious allozyme and morphological and eco- 
logical data) do not support the mtDNA 
matrilines, we suggest that there is no evi- 
dence of organismal subdivision in O. h. 
robertsoni (for a very similar case involving 
Drosophila simulans: Ballard et al., 2002).

It is beyond the scope of this paper to dis- 
cuss the distinct selective forces acting on the 
mitochondrial and nuclear genomes. However, 
tests for deviation from a strictly neutral model 
of evolution in our mtDNA data sets based on 
Fu and Li's D* and F* (Fu & Li, 1993) as imple- 
mented in DnaSP 3.53 (J. Rozas & R. Rozas, 
1999) showed that the COI data set deviates 
significantly from expectations under neutral- 
ity both in Fu and Li's D* (1.78, P < 0.02) and 
in Fu and Li’s F* (1.80, P < 0.05). Neutrality 
was not rejected in the (smaller) LSU rDNA 
data set with values of 0.56 (P > 0.10) and 0.57 
(P > 0.10) for Fu and Li’s D* and F*, respec- 
tively. At least the results for the COI data set 
suggest that selection and/or population level 
processes like expansion, contraction, or sub- 
division (Ballard & Whitlock, 2004) are acting 
upon the mtDNA in O. h. robertsoni. 

Interestingly, parasites are among the extrinsic
forces that have been shown to influence mtDNA
evolution in natural populations (e.g., Turelli
& Hoffmann, 1995; Ballard et al., 2002).
Whether the parasite of O. h. robertsoni,
Schistosoma sp., has a similar effect on the mtDNA
evolution of its host would need to be tested in
future studies.

In the present paper, we offer DNA data from 
two mitochondrial gene fragments as well as 
preliminary data from AFLP genotyping as a
first step to assess the problem of deviant lin-
eages in O. h. robertsoni. We suggest that the
presence of a cryptic species complex or the 
occurrence of NUMTs are unlikely to explain 
the phylogeographic patterns observed. 
Although we cannot completely dismiss the
occurrence of heteroplasmy or duplications 
within the mitochondrial genome, which have 
been observed in molluscs before, these ex- 
planations are unlikely as well. The most prob- 
able scenario is the retention of ancestral 
mtDNA polymorphism possibly in combination 
with some effects of secondary contact. Based 
on our preliminary AFLP data, we also sug- 
gest that there is no evidence of organismal 
subdivision in O. h. robertsoni. However, these 
hypotheses need to be tested thoroughly in 
future studies.

Nevertheless, we find it important to present 
our preliminary findings in order to draw at-
tention to the problem observed. As an interme-
diate host for schistosomiasis in western
China, Oncomelania h. robertsoni is receiving 
growing attention in ecological and parasito- 
logical studies. It is strongly suggested that 
future studies incorporate more data from 
nuclear loci in order to better understand 
phylogeography, population genetics and host- 
parasite co-evolution in O. h. robertsoni. 


APPENDIX

Individual codes (M = Meishan Area, A = Anning River Valley, Y = Yunnan), DNA voucher numbers
and GenBank accession numbers for Chinese specimens of Oncomelania hupensis robertsoni studied.

Individual  DNA          GenBank accession #     Individual  DNA          GenBank accession #
code        voucher #**  COI/LSU rRNA            code        voucher #**  COI/LSU rRNA

M1a         0963         DQ212797/-              A4d         0958         DQ212828/DQ212883
M1b         0965         DQ212798/-              A5a         0951         DQ212829/DQ212884
M2a         0941         DQ212799/DQ212863       A5b         0952         DQ212830/DQ212885
M2b         0942         DQ212800/DQ212864       A5c         0953         DQ212831/DQ212886
M2c         0943         DQ212801/DQ212865       A5d         0954         DQ212832/-
M2d         1022         DQ212802/-              A6a         0947         DQ212833/DQ212887
M2e         1388         DQ212803/DQ212866       A6b         0948         DQ212834/-
M2f         1389         DQ212804/DQ212867       A6c         0949         DQ212835/-
M2g         1390         DQ212805/DQ212868       A6d         0950         DQ212836/DQ212888
M2h         1391         DQ212806/DQ212869       A7a         0937         DQ212837/DQ212889
M2i         1392         DQ212807/DQ212870       A7b         0938         DQ212838/-
M2j         1393         DQ212808/DQ212871       A7c         0939         DQ212839/DQ212890
M3a         MG14         DQ212809/-              A7d         1021         DQ212840/-
M3b         MG15         DQ212810/-              A8a         0019         DQ212841/-
M3c         MG16         DQ212811/-              A8b         0020         DQ212842/DQ212891
M3d         MG30         DQ212812/-              A8c         0021         DQ212843/DQ212892
M3e         MG33         DQ212813/-              A8d         0022         DQ212844/DQ212893
M4a         -*           AF531547/AF531545*      A8e         0023         DQ212845/-
A1a         0932         DQ212814/-              A8f         0026         DQ212846/DQ212894
A1b         0934         AF213339/AF212893       A8g         0028         DQ212847/DQ212895
A1c         0935         DQ212815/-              A8h         0029         DQ212848/DQ212896
A1d         1018         DQ212816/-              A8i         0030         DQ212849/-
A2a         0928         DQ212817/DQ212872       A8j         0050         DQ112252/-
A2b         0929         DQ212818/DQ212873       A8k         0051         DQ212850/DQ212897
A2c         0930         DQ212819/DQ212874       A8l         0057         DQ212851/DQ212898
A2d         0931         DQ212820/DQ212875       Y1a         0045         AF253074/DQ212899
A3a         0959         DQ212821/DQ212876       Y1b         0046         DQ212852/-
A3b         0960         DQ212822/DQ212877       Y1c         0048         AF253075/-
A3c         0961         DQ212823/DQ212878       Y1d         0055         DQ212853/-
A3d         0962         DQ212824/DQ212879       Y1e         0066         DQ212854/-
A4a         0955         DQ212825/DQ212880       Y1f         1505         DQ212855/DQ212900
A4b         0956         DQ212826/DQ212881       Y1g         1506         DQ212856/DQ212901
A4c         0957         DQ212827/DQ212882       Y1h         1508         DQ212857/DQ212902


* from Attwood et al. (2003) 
** deposited at the DNA voucher collection of the Justus Liebig University, Giessen 


